Mean field and corrections for the Euclidean 
Minimum Matching problem 
Consider the length $L_{MM}^E$ of the minimum matching of N points in d-dimensional Euclidean space.
Using numerical simulations and the finite size scaling law
$\langle L_{MM}^E \rangle = \beta_{MM}^E(d)\, N^{1-1/d} (1 + A/N + \cdots)$,
we obtain precise estimates of $\beta_{MM}^E(d)$ for $2 \le d \le 10$. We then consider the approximation where
distance correlations are neglected. This model is solvable and gives at $d \ge 2$ an excellent
"random link" approximation to $\beta_{MM}^E(d)$. Incorporation of three-link correlations further improves
the accuracy, leading to a relative error of 0.4% at d = 2 and 3. Finally, the large d behavior of
this expansion in link correlations is discussed.

PACS numbers: 75.50.Lk, 64.60.Cn

There has been a tremendous amount of work on mean 
field calculations for disordered systems in the past 20 
years, in part driven by the exact solution provided by 
Parisi's replica symmetry breaking Ansatz. Although his 
solution was developed in the context of spin glasses, the 
formalism has been extremely useful for understanding 
other disordered systems. Generally, one expects mean 
field to give exact results as the dimensionality goes to 
infinity. One can then ask whether mean field leads to 
"acceptable" errors for systems of interest, e.g., in three 
dimensions, and whether it is possible to compute Eu- 
clidean corrections to the mean field formulae. Such 
computations might correspond to a 1/d expansion for 
the thermodynamic functions of interest. In frustrated 
disordered systems, however, this has turned out to be
intractable. To date, such Euclidean corrections have been
pushed furthest for the Minimum Matching Problem
(MMP). In what follows, we determine the accuracy
of the mean field approximation and the effectiveness
of the corrections thereto in the MMP by comparing
with the actual properties of the d-dimensional Euclidean 
model. First, we find that the relative error introduced 
by the mean field approximation for the zero tempera- 
ture energy density is less than 4% at d = 2 and 3% at 
d = 3. Second, the inclusion of the "leading" Euclidean 
corrections to the mean field approximation reduces the 
error by a factor of about 10 at d = 2 and d = 3. Third,
we argue that the large d behavior of systems such as the 
MMP depends on arbitrarily high order correlations and 
is thus beyond all orders of the expansion proposed by 
Mezard and Parisi.

Consider N points (N even) and a specified set of link 
lengths $l_{ij} = l_{ji}$ separating the points, for $1 \le i, j \le N$.
One defines a matching (a dimerization) of these points 
by combining them pairwise so that each point belongs 
to one and only one pair. Define also the energy or length 
of a matching as the sum of the lengths of the links asso- 
ciated with each matched pair. The Minimum Matching 
Problem is the problem of finding the matching of mini- 
mum energy. One can also consider the thermodynamics 
of this system, as proposed by Orland and by Mezard
and Parisi, by taking all matchings but weighting
them with the Boltzmann factor associated with their
energy. Here we concentrate on the T = 0 properties because
there is no effective numerical method for extracting
thermodynamic functions in this system, but exact
energy minima can be obtained quite easily for any given 
instance. Indeed, the MMP belongs to the algorithmic 
class P of polynomial problems, and there are standard 
algorithms which solve any instance of size N using on 
the order of $N^3$ steps.
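To make the definition concrete, here is a minimal sketch of an exact solver for very small instances. It is not the polynomial algorithm referred to above: it uses a bitmask dynamic program whose cost grows exponentially with N, and the function name `minimum_matching_length` is purely illustrative.

```python
from functools import lru_cache


def minimum_matching_length(lengths):
    """Exact minimum matching length for a symmetric matrix of link lengths.

    Bitmask dynamic programming (exponential in N), so only practical for
    roughly N <= 20. Illustrative only; not a polynomial-time method.
    """
    n = len(lengths)
    if n % 2:
        raise ValueError("N must be even")

    @lru_cache(maxsize=None)
    def best(unmatched):
        # `unmatched` is a bitmask of the points that still need a partner.
        if unmatched == 0:
            return 0.0
        i = (unmatched & -unmatched).bit_length() - 1  # lowest unmatched point
        rest = unmatched & ~(1 << i)
        # Try every remaining partner j for point i and keep the cheapest.
        return min(
            lengths[i][j] + best(rest & ~(1 << j))
            for j in range(i + 1, n)
            if (rest >> j) & 1
        )

    return best((1 << n) - 1)


# Four points on a line at 0, 1, 3, 4: the optimum pairs (0,1) and (3,4).
points = [0.0, 1.0, 3.0, 4.0]
lengths = [[abs(a - b) for b in points] for a in points]
print(minimum_matching_length(lengths))
```

Always pairing the lowest-indexed unmatched point first avoids counting the same matching several times.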

Physically, one is not interested in the properties of any 
particular instance of the MMP; more relevant are typical 
and ensemble properties such as the average energy when 
the lengths $l_{ij}$ are random variables with a given distribution.
One then speaks of the stochastic MMP. There
are two frequently used ensembles for the $l_{ij}$, corresponding
to the Euclidean MMP and the random link MMP.
In the first, the N random points lie in a d-dimensional
Euclidean volume and the $l_{ij}$ are the usual Euclidean
distances between pairs of points. The points are independent
and identically distributed, so one speaks of a
random point problem. In the second ensemble, it is the
link lengths $l_{ij}$ which are independent and identically distributed
random variables. A connection between these
two systems was first given by Mezard and Parisi: they
pointed out that the one- and two-link distributions could
be made identical in the two problems. A consequence
is that the "Cayley tree" approximations for the random
point and random link problems are the same. Mezard
and Parisi were able to solve the random link MMP using
an approach based on replicas. One may then consider
the random link MMP to be a "mean field model"
for the Euclidean MMP. The mean field approximation
consists of using the thermodynamic functions of the random
link model as estimators for those of the Euclidean
model. This approximation is applicable to all link-based
combinatorial optimization problems, such as the assignment
problem and the traveling salesman problem; hereafter
we shall refer to it as the random link approximation.
Finally, Mezard and Parisi have shown how to derive
corrections systematically to the random link approximation
using a connected-correlation link expansion.
They have computed the leading corrections, these being
associated with the triangle inequality (3-link correlations)
in the Euclidean model. How accurate are these
approximations? To answer this, we first give our results 
for the Euclidean problem, and then compare with the 
predictions of the random link approximation and of the 
link expansion method. 
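The two ensembles can be sketched as follows. This is an illustrative sampler, not the authors' code: it assumes the short-distance link density $d B_d l^{d-1}$ that the paper introduces further on, uses periodic boundaries for the Euclidean points (as the paper does for its simulations), and makes an arbitrary choice for the cutoff of the random link distribution at large lengths. All function names are hypothetical.

```python
import math
import numpy as np


def unit_ball_volume(d):
    """Volume B_d of the d-dimensional ball of unit radius."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def euclidean_links(n, d, rng):
    """Distances between n uniform points in the unit hypercube (periodic)."""
    pts = rng.random((n, d))
    delta = np.abs(pts[:, None, :] - pts[None, :, :])
    delta = np.minimum(delta, 1.0 - delta)  # shortest displacement on the torus
    return np.sqrt((delta ** 2).sum(axis=-1))


def random_links(n, d, rng):
    """Independent symmetric link lengths with density d * B_d * l**(d-1).

    Inverse-CDF sampling: P(l < r) = B_d * r**d for r <= B_d**(-1/d).
    The cutoff at large l is an assumption of this sketch.
    """
    u = rng.random((n, n))
    upper = np.triu((u / unit_ball_volume(d)) ** (1.0 / d), k=1)
    return upper + upper.T


rng = np.random.default_rng(0)
n, d, r = 200, 2, 0.2
pairs = np.triu_indices(n, k=1)
frac_euclid = (euclidean_links(n, d, rng)[pairs] < r).mean()
frac_random = (random_links(n, d, rng)[pairs] < r).mean()
print(frac_euclid, frac_random, unit_ball_volume(d) * r ** d)
```

Both empirical fractions should be close to $B_d r^d$, which illustrates that the single-link distributions agree; the ensembles differ only through correlations among three or more links.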

In the Euclidean MMP, let $L_{MM}^E$ be the energy or
length of the minimum matching. Taking the points to be
independent and uniformly distributed in a unit volume,
Steele has shown that as $N \to \infty$, $L_{MM}^E / N^{1-1/d}$
converges with probability one to a non-random,
N-independent constant $\beta_{MM}^E(d)$. In physics language, this
result shows that $L_{MM}^E$ is self-averaging and that the
zero-temperature energy density has an infinite volume
limit when the density of points is kept fixed. To date,
little has been done to compute $\beta_{MM}^E(d)$. The best estimates
seem to be due to Smith: $\beta_{MM}^E(2) \approx 0.312$
and $\beta_{MM}^E(3) \approx 0.318$. We use a systematic procedure
to obtain $\beta_{MM}^E(d)$ with quantifiable errors. First, in
order to have a well defined dependence on N, we have
used the ensemble average, $\langle L_{MM}^E \rangle / N^{1-1/d}$. Second, in
order to reduce corrections to scaling in the extrapolation
to the large N limit, we have placed the points randomly
in the d-dimensional unit hypercube with periodic
boundary conditions. This removes surface effects and
empirically leads to the finite size scaling law

$$\frac{\langle L_{MM}^E \rangle}{N^{1-1/d}} = \beta_{MM}^E(d) \left( 1 + \frac{A(d)}{N} + \frac{B(d)}{N^2} + \cdots \right). \qquad (1)$$

Finally, in order to reduce statistical fluctuations, we
have used a variance reduction trick. The improved
estimator has the effect of reducing the variance of our
estimates by more than a factor 4, and thus saves us
a considerable amount of computer time. Using these
methods, we have extracted from our numerical data
$\beta_{MM}^E(d)$ and its associated statistical error. The fits to
Eq. (1) are good, with $\chi^2$ values confirming the form of
the finite size scaling law. The error bars on the extrapolated
value $\beta_{MM}^E(d)$ are obtained in the standard way
by requiring that $\chi^2$ increase by one from its minimum.
We find in particular $\beta_{MM}^E(2) = 0.3104 \pm 0.0002$, and
$\beta_{MM}^E(3) = 0.3172 \pm 0.00015$; values at higher dimensions
are given in Table I. We have checked that these results
are not significantly modified when using another random
number generator to produce the instances, and that the
fits are stable to truncation of the data.
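The extrapolation step can be sketched as a weighted least-squares fit. Since the scaling law truncated after the $1/N^2$ term is linear in $\beta$, $\beta A$ and $\beta B$, the one-sigma error from the covariance matrix coincides with the rule that $\chi^2$ increases by one. The function name and the numbers used in the self-check are invented for illustration and are not the paper's data.

```python
import numpy as np


def fit_scaling_law(n_values, y, sigma):
    """Weighted fit of y(N) = beta * (1 + A/N + B/N**2).

    y is the ensemble average <L>/N**(1-1/d) and sigma its standard error.
    Returns beta, its one-sigma error, A, B and the chi^2 of the fit.
    """
    n_values = np.asarray(n_values, dtype=float)
    y = np.asarray(y, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # The model is linear in (beta, beta*A, beta*B).
    design = np.column_stack(
        [np.ones_like(n_values), 1.0 / n_values, 1.0 / n_values ** 2]
    )
    weighted = design / sigma[:, None]
    coeffs, *_ = np.linalg.lstsq(weighted, y / sigma, rcond=None)
    cov = np.linalg.inv(weighted.T @ weighted)
    chi2 = float(np.sum(((y - design @ coeffs) / sigma) ** 2))
    beta = float(coeffs[0])
    beta_err = float(np.sqrt(cov[0, 0]))
    return beta, beta_err, float(coeffs[1] / beta), float(coeffs[2] / beta), chi2


# Self-check on synthetic, noise-free values (made-up parameters).
sizes = np.array([12, 16, 24, 32, 48, 64, 100], dtype=float)
true_beta, true_a, true_b = 0.31, 0.2, -0.1
synthetic = true_beta * (1 + true_a / sizes + true_b / sizes ** 2)
errors = np.full_like(sizes, 1e-4)
beta, beta_err, a, b, chi2 = fit_scaling_law(sizes, synthetic, errors)
print(beta, beta_err, a, b, chi2)
```

With real data, `y` and `sigma` would come from averaging over many random instances at each N.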

Now we discuss how to use the random link model to
approximate $\beta_{MM}^E(d)$. For any two points placed
at random in the unit d-dimensional hypercube, the density
distribution of $l_{ij}$ is given at short distances by
$P_d(l_{ij} = r) = d B_d r^{d-1}$, where $B_d = \pi^{d/2}/(d/2)!$ is the
volume of the d-dimensional ball with unit radius. If we
take the random link model where link lengths are independent
and have the individual distribution $P_d(l)$, then
the Euclidean and random link MMP have the same one-
and two-link distributions because two Euclidean distances
are independent. If correlations among three or
more link lengths are weak, then the properties of the two
systems should be quantitatively close. Thus an analytic
approximation to $\beta_{MM}^E(d)$ is obtained by computing its
analogue $\beta_{MM}^{RL}(d)$ in the random link MMP.
Mezard and Parisi solved these random link models
under the replica symmetry hypothesis. They showed
further that the replica symmetric solution is stable
(at least for d = 1), and thus is most likely exact unless
a first order phase transition occurs in this system.
Their solution gives $\beta_{MM}^{RL}(d)$ in terms of a function $G_d$
related to the probability distribution of link lengths for
matched pairs. In our Euclidean units their result can be
written

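$$\beta_{MM}^{RL}(d) = \frac{D_1(d)}{2}\, \frac{d}{(1/d)!} \int_{-\infty}^{+\infty} G_d(x)\, e^{-G_d(x)}\, dx \qquad (2)$$

where $G_d$ satisfies the integral equation

$$G_d(x) = d \int_{-x}^{+\infty} (x + y)^{d-1}\, e^{-G_d(y)}\, dy \qquad (3)$$

and where

$$D_1(d) = \lim_{N \to \infty} \langle L_1 \rangle / N^{1-1/d} = (1/d)!\, B_d^{-1/d} \qquad (4)$$

is the average (rescaled) link length of the nearest neighbor
graph in the limit $N \to \infty$.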
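A minimal numerical sketch of how such an integral equation can be solved is given below. It assumes Eqs. (2)-(4) in the form $\beta_{MM}^{RL}(d) = \frac{D_1(d)}{2} \frac{d}{(1/d)!} \int G_d e^{-G_d} dx$ with $G_d(x) = d \int_{-x}^{+\infty} (x+y)^{d-1} e^{-G_d(y)} dy$, and uses a damped fixed-point iteration on a uniform grid. The grid size, damping and tolerance are arbitrary choices of this sketch, not the authors' procedure.

```python
import math
import numpy as np


def unit_ball_volume(d):
    """Volume B_d of the d-dimensional ball of unit radius."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def solve_G(d, half_width=15.0, n=1201, damping=0.5, tol=1e-10, max_iter=5000):
    """Solve G(x) = d * int_{-x}^{inf} (x+y)**(d-1) * exp(-G(y)) dy on a grid."""
    h = 2.0 * half_width / (n - 1)
    idx = np.arange(n)
    x = -half_width + h * idx
    k = idx[:, None] + idx[None, :] - (n - 1)  # x_i + x_j = h * k exactly
    kernel = np.where(k > 0, d * (h * np.maximum(k, 0)) ** (d - 1), 0.0)
    if d == 1:
        kernel[k == 0] = 0.5  # trapezoid half-weight at the lower limit y = -x
    G = np.logaddexp(0.0, x)  # positive, increasing initial guess
    for _ in range(max_iter):
        # Damped update: average the current guess with its image under the map.
        G_new = (1 - damping) * G + damping * h * (kernel @ np.exp(-G))
        if np.max(np.abs(G_new - G) / (1.0 + G)) < tol:
            return x, G_new, h
        G = G_new
    raise RuntimeError("fixed-point iteration did not converge")


def beta_random_link(d):
    """Random link constant from the assumed form of Eq. (2)."""
    x, G, h = solve_G(d)
    integral = h * np.sum(G * np.exp(-G))
    one_over_d_factorial = math.gamma(1 + 1 / d)
    D1 = one_over_d_factorial * unit_ball_volume(d) ** (-1 / d)
    return 0.5 * D1 * d / one_over_d_factorial * integral


betas = {d: beta_random_link(d) for d in (1, 2, 3)}
for d, value in betas.items():
    print(d, round(value, 5))
```

At d = 1 the equation has the closed-form solution $G_1(x) = \ln(1 + e^x)$, which gives $\pi^2/24$ and provides a check on the discretization. The damping matters because the undamped map oscillates between shifted copies of the solution.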

Brunetti et al. have used direct numerical simulations
of these random link models to confirm the predictions
to the level of 0.2% at d = 1 and 2, and we have
done the same to the level of 0.1% at $1 \le d \le 10$, giving
further evidence that the replica symmetric solution is exact.
From the analytical side, solving the integral equation
for $G_d$ leads to $\beta_{MM}^{RL}(1) = \pi^2/24 = 0.4112335\ldots$,
$\beta_{MM}^{RL}(2) = 0.322580\ldots$, and $\beta_{MM}^{RL}(3) = 0.326839\ldots$; values
at higher dimensions are given in Table I. If we consider
$\beta_{MM}^{RL}(d)$ as a mean field prediction for $\beta_{MM}^E(d)$, the
accuracy is surprisingly good. Including the trivial value
$\beta_{MM}^E(1) = 0.5$, we see that the random link approximation
leads to a relative error of 17.8% at d = 1, of
3.9% at d = 2, and of 3.0% at d = 3. Also, the error
decreases with increasing dimension. It can be argued,
for the MMP as well as for other link-based combinatorial
optimization problems, that the random link
approximation not only has a relative error tending towards
0 as $d \to \infty$, but that in fact this error is at most
of order $1/d^2$. Given our high quality estimates, we are
able to confirm this property numerically. In Figure 1
we plot the quantity $d\,(\beta_{MM}^{RL} - \beta_{MM}^E)/\beta_{MM}^E$ along with
a quadratic fit given to guide the eye. As expected, the
data scales as 1/d. Thus the random link approximation
gives both the leading and 1/d subleading dependence
of $\beta_{MM}^E(d)$. In order to obtain analytic expressions for
these coefficients, we have derived the 1/d expansion for
$\beta_{MM}^{RL}$ from Eqs. (2) and (3). We used two methods to do this.
The first, straightforward but computationally lengthy,
consists of setting $G_d(x) = \tilde{G}_d(\tilde{x})$ with $x = \tilde{x}/d + 1/2$ and then
writing $\tilde{G}_d(\tilde{x})$ as a power series in 1/d. From this we find

$$\beta_{MM}^{RL}(d) = \frac{D_1(d)}{2} \left( 1 + \frac{1 - \gamma}{d} + O(1/d^2) \right) \qquad (5)$$

where $\gamma = 0.577\ldots$ is Euler's constant. If, as claimed, the
random link approximation gives an error of order $1/d^2$,
Eq. (5) gives an analytic expression for the leading and
first subleading terms in the 1/d expansion of $\beta_{MM}^E(d)$.
This claim is strongly supported by the numerical results:
performing a fit of our $\beta_{MM}^E(d)$ values to a truncated 1/d
series leads to $0.424 \pm 0.008$ for the coefficient of the 1/d
term; this is to be compared to the theoretical prediction
of $1 - \gamma = 0.422784\ldots$
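As a consistency check on the leading term, the rescaling of the first method can be carried through at lowest order. The sketch below assumes that Eqs. (2) and (3) have the forms written in its opening comment; the symbols $\tilde{y}$ and $C$ are introduced here for illustration only.

```latex
% Assumed forms of Eqs. (2) and (3):
%   \beta^{RL}_{MM}(d) = \frac{D_1(d)}{2}\,\frac{d}{(1/d)!}
%                        \int_{-\infty}^{+\infty} G_d(x)\, e^{-G_d(x)}\, dx,
%   G_d(x) = d \int_{-x}^{+\infty} (x+y)^{d-1} e^{-G_d(y)}\, dy .
\begin{align*}
x &= \tfrac12 + \tilde x/d, \quad y = \tfrac12 + \tilde y/d
  \;\Rightarrow\;
  (x+y)^{d-1} = \Bigl(1 + \tfrac{\tilde x + \tilde y}{d}\Bigr)^{d-1}
  \xrightarrow[d\to\infty]{} e^{\tilde x + \tilde y}, \\
\tilde G(\tilde x) &= \int_{-\infty}^{+\infty}
  e^{\tilde x + \tilde y}\, e^{-\tilde G(\tilde y)}\, d\tilde y
  = C e^{\tilde x}, \qquad
  C = \int_{-\infty}^{+\infty} e^{\tilde y}\, e^{-C e^{\tilde y}}\, d\tilde y
    = \frac{1}{C}
  \;\Rightarrow\; C = 1, \\
\int_{-\infty}^{+\infty} G_d\, e^{-G_d}\, dx
  &\to \frac{1}{d} \int_{-\infty}^{+\infty}
       e^{\tilde x}\, e^{-e^{\tilde x}}\, d\tilde x = \frac{1}{d}, \\
\beta^{RL}_{MM}(d) &\to \frac{D_1(d)}{2}\,\frac{d}{(1/d)!}\,\frac{1}{d}
  = \frac{D_1(d)}{2}\bigl(1 + O(1/d)\bigr).
\end{align*}
```

The prefactor $d$ in the integral equation cancels against $dy = d\tilde{y}/d$, and $(1/d)! \to 1$ as $d \to \infty$, so the leading term $D_1(d)/2$ of Eq. (5) is recovered.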

We have been able to obtain the next coefficient of the 
series in 1/d for $\beta_{MM}^{RL}$ by using a second method. We
introduce a modified random link model where the links
are shifted and rescaled in such a way that the leading
term of the 1/d expansion for this new model is exactly
the 1/d coefficient for the initial one. In fact it is possible
to introduce a sequence of such "rescaled" models,
where the k-th model is designed to produce the $1/d^k$ term
of the expansion. We have computed the leading terms
predicted by a replica symmetric analysis of these models
for k = 1 and 2, from which we find that the order $1/d^2$
coefficient in Eq. (5) is $\pi^2/12 + \gamma^2/2 - \gamma$.
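The truncated series is easy to evaluate. The sketch below assumes Eq. (5) in the form $\beta_{MM}^{RL}(d) = \frac{D_1(d)}{2}(1 + c_1/d + c_2/d^2)$ with $c_1 = 1 - \gamma$ and $c_2 = \pi^2/12 + \gamma^2/2 - \gamma$, and $D_1(d) = (1/d)!\, B_d^{-1/d}$; the function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329


def nearest_neighbor_constant(d):
    """D_1(d) = (1/d)! * B_d**(-1/d), with B_d the unit ball volume."""
    ball = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    return math.gamma(1 + 1 / d) * ball ** (-1 / d)


def beta_rl_series(d, order=2):
    """1/d expansion of the random link constant, truncated at `order`."""
    c1 = 1 - EULER_GAMMA
    c2 = math.pi ** 2 / 12 + EULER_GAMMA ** 2 / 2 - EULER_GAMMA
    series = 1.0
    if order >= 1:
        series += c1 / d
    if order >= 2:
        series += c2 / d ** 2
    return 0.5 * nearest_neighbor_constant(d) * series


for d in (4, 6, 8, 10):
    print(d, round(beta_rl_series(d, order=1), 5), round(beta_rl_series(d), 5))
```

At d = 10 the second-order truncation lies within about $10^{-4}$ of the value 0.453200 listed in Table I, and it is closer than the first-order truncation, as one would expect if the quoted coefficients are correct.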

We now come to the final point of the paper: how
well can one predict $\beta_{MM}^E(d)$ by incorporating Euclidean
corrections to the random link approximation? It is necessary
here to review the work of Mezard and Parisi;
for greater detail, we refer the reader to their article.
They begin with the partition function Z for an
arbitrary stochastic MMP and write the quenched average
for n replicas. In the Euclidean model, the $l_{ij}$ have
three and higher-link correlations. Mezard and Parisi
keep the three-link correlations (arising only when the
three links make a triangle) and neglect higher connected
correlations. Note that it is not clear a priori whether
these "higher order" terms are negligible compared to
the three-link term. The resulting expression for the
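quenched average becomes

$$\overline{Z^n} = \cdots \times \exp\Big[ \sum_{(ij)} \overline{u_{ij}} + \sum_{(ij),(kl),(mn)} \overline{u_{ij}\, u_{kl}\, u_{mn}}^{\,c} \Big] \qquad (6)$$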

where $u_{ij}$ is a complicated nonlinear function of the link
length $l_{ij}$. They then compute the limit $N \to \infty$, $n \to 0$
using the saddle point method while assuming that
replica symmetry is not broken. In the zero temperature
limit, just as in the standard random link model, the
saddle point equations can be expressed in terms of $G_d$,
but now $G_d$ satisfies a more complicated integral equation
(Eq. (34) in their paper). From this, one can calculate
new estimates for $\beta_{MM}^E(d)$, which we shall denote $\beta_{MM}^{EC}(d)$,
where EC stands for Euclidean corrections.

We have solved the equations numerically for this modified
$G_d$, and have computed $\beta_{MM}^{EC}(d)$ for $2 \le d \le 10$. We
find $\beta_{MM}^{EC}(2) = 0.30915$ and $\beta_{MM}^{EC}(3) = 0.31826$. The results
for $d \ge 4$ are given in Table I. Comparing with
$\beta_{MM}^E(d)$ and $\beta_{MM}^{RL}(d)$, we see that the new estimates are
considerably more accurate. At d = 2, the random link
approximation leads to an error of 3.9%; this error is
decreased by nearly a factor 10 by incorporating these
leading Euclidean corrections. Similarly at d = 3, the
error is reduced from 3.0% to less than 0.4%. At larger
d, the error continues to decrease, but the effect is less
dramatic.

To interpret this last result, consider how the difference
$\beta_{MM}^{EC} - \beta_{MM}^{RL}$ scales with d. Using Eq. (6), we see
that it is sufficient to estimate the size of the 3-link correction
term. Its dimensional dependence follows that
of the probability of finding nearly equilateral triangles
as $d \to \infty$. Since this probability goes to zero exponentially
with d, the 3-link correlations give tiny corrections
at large d (as confirmed by the numerics), and also the
power series expansion in 1/d of $\beta_{MM}^{EC}$ is identical to that
of $\beta_{MM}^{RL}$. This property continues to hold if one includes
4, 5, or any finite number of multi-link correlations in
Eq. (6). This is due to the fact that the Euclidean and
random link graphs have local properties that are identical
up to exponentially small terms in d. In particular,
the statistics of fixed sized (N-independent) loops connecting
near neighbors are nearly identical.

Although this reasoning was given for the MMP, it 
applies equally well to other link-based problems. In 
such statistical mechanics systems, if the thermodynamic 
functions depend only on the local properties of the 
(short) link graph, then the random link approximation 
applied to the Euclidean system will have an error which 
is exponentially small in d. However, for combinato- 
rial optimization problems such as the MMP, the assign- 
ment problem, and the traveling salesman problem, the 
$N \to \infty$ limit and the k-link expansion do not commute:
k-link correlations with k growing with N remain important
as $N \to \infty$. In particular, arbitrarily large loops
matter and contribute to the thermodynamics at order
$1/d^2$. In a polymer picture, we can say that the random
link approximation is exponentially good in the dilute
phase, while it leads to $1/d^2$ errors in the collapsed phase.
The power corrections in this phase are beyond all orders
in a k-link correlation expansion such as Eq. (6).

In summary, we have estimated by numerical simulation
$\beta_{MM}^E(d)$, the zero-temperature energy density in the Euclidean
Minimum Matching problem at dimensions $2 \le d \le 10$.
We have then computed two analytical estimates for
these energy densities, namely $\beta_{MM}^{RL}(d)$ and $\beta_{MM}^{EC}(d)$.
The first method uses the random link approximation 
where all link correlations are neglected. Using the "ex- 
act" mean field solution of Mezard and Parisi, we find 
that even at low dimensions, the error introduced by 
this approximation is small: 3.9% at d = 2, 3.0% at 
d = 3, and 2.0% at d = 4. In the second method, 
the connected three-link correlations are taken into ac- 
count while higher ones are neglected. Using Mezard 
and Parisi's expressions, we find that this modification 
to the random link model gives excellent predictions at 
d = 2 and 3, with the error there being divided by almost
10 compared to the random link approximation.
This provides a stringent quantitative test of a system- 
atic expansion which goes beyond uncorrelated disorder 
variables, and suggests that even the leading such cor- 
rection is enough to get predictions for thermodynamic 
functions precise to better than one percent. Finally, at 
high dimensions, we have seen that the k-link correlation
expansion leads to corrections which vanish exponentially
with d; this expansion thus misses important
1/d power law corrections for problems such as the MMP.
This leaves open the determination of the $1/d^2$ term in
the expansion of the constants $\beta_{MM}^E(d)$. We have performed
a fit on our data, imposing the leading and the
1/d term to be those of the random link model. We find
the $1/d^2$ coefficient to be very small (smaller than 0.01
in absolute value). Clearly, it would be of major interest 
to obtain an analytical value for this term. 

TABLE I. Comparison of the MMP constants for the three
models: Euclidean, random link, and random link including
3-link Euclidean corrections ($4 \le d \le 10$).

 d    $\beta_{MM}^{E}(d)$    $\beta_{MM}^{RL}(d)$   $d(\beta^{RL}-\beta^{E})/\beta^{E}$   $\beta_{MM}^{EC}(d)$   $(\beta^{EC}-\beta^{E})/\beta^{E}$
 4    0.3365 ± 0.0003        0.343227               +0.080                                0.33756                +0.30%
 5    0.3572 ± 0.00015       0.362175               +0.070                                0.35818                +0.27%
 6    0.3777 ± 0.0001        0.381417               +0.059                                0.37849                +0.21%
 7    0.3972 ± 0.0001        0.400277               +0.054                                0.39807                +0.22%
 8    0.4162 ± 0.0001        0.418548               +0.045                                0.41685                +0.17%
 9    0.4341 ± 0.0001        0.436185               +0.042                                0.43485                +0.17%
 10   0.4515 ± 0.0001        0.453200               +0.037                                0.45214                +0.14%
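The relative errors quoted in the text and the derived columns of Table I can be recomputed from the listed constants. The short script below is a sketch for checking that arithmetic; the values for d = 2 and 3 are those quoted in the text, the rest come from Table I, and the function name is illustrative.

```python
# (d, beta_E, beta_RL, beta_EC) as quoted in the text (d = 2, 3) and in Table I.
ROWS = [
    (2, 0.3104, 0.322580, 0.30915),
    (3, 0.3172, 0.326839, 0.31826),
    (4, 0.3365, 0.343227, 0.33756),
    (5, 0.3572, 0.362175, 0.35818),
    (6, 0.3777, 0.381417, 0.37849),
    (7, 0.3972, 0.400277, 0.39807),
    (8, 0.4162, 0.418548, 0.41685),
    (9, 0.4341, 0.436185, 0.43485),
    (10, 0.4515, 0.453200, 0.45214),
]


def relative_errors(beta_e, beta_rl, beta_ec):
    """Relative errors of the random link and corrected estimates."""
    return (beta_rl - beta_e) / beta_e, (beta_ec - beta_e) / beta_e


errors = {d: relative_errors(e, rl, ec) for d, e, rl, ec in ROWS}
for d, (rl_err, ec_err) in errors.items():
    # Columns: d, RL error (%), d * RL error, EC error (%)
    print(d, round(100 * rl_err, 2), round(d * rl_err, 3), round(100 * ec_err, 2))
```

Because the Euclidean constants are listed with only four digits, the recomputed percentages can differ from the tabulated ones in the last digit.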